DETAILED ACTION 

1. Claims 1-22 have been considered. Claim 14 has been amended as per Applicant's 
request. 

Papers Submitted 

2. It is hereby acknowledged that the following papers have been received and placed of 
record in the file: After Final Amendment as received on 30 August 2005. 

Claim Rejections - 35 USC §103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

4. Claims 1-22 are rejected under 35 U.S.C. 103(a) as being unpatentable over Greenley, 
U.S. Patent No. 5,761,469 (herein referred to as Greenley) in view of McGeer, Brayton, 
Sangiovanni-Vincentelli, and Sahni's "Performance Enhancement through the Generalized 
Bypass Transform" from IEEE ©1991 (herein referred to as McGeer). 

5. Regarding claims 1 and 14, taking claim 14 as exemplary, Greenley has taught a 
processing system comprising: 

a. A data processor (Greenley 100 of Fig. 1) comprising: 

i. An instruction execution pipeline comprising N processing stages, each of 
said N processing stages capable of performing one of a plurality of 
execution steps associated with a pending instruction being executed by 
said instruction execution pipeline (Greenley Col. 1 lines 34-40); 

ii. A data cache (Greenley 180 of Fig. 1) capable of storing data values used 
by said pending instruction (Greenley Col. 1 lines 42-43); 

iii. A plurality of registers (150 of Fig. 1) capable of receiving said data values 
from said data cache (Greenley Col. 1 lines 41-45); 

iv. A load store unit (Greenley 130 of Fig. 1) capable of transferring a first one 
of said data values from said data cache to a target one of said plurality of 
registers during execution of a load operation (Greenley Col. 1 lines 15-21, 
63-67 and Col. 2 lines 1-7, 13-15); 

v. A shifter circuit (Greenley 160,170 of Fig. 1) associated with said load 
store unit capable of one of a) shifting (Greenley Col. 2 lines 19-31), b) 
sign extending (Greenley Col.2 lines 48-54), and c) zero extending 
(Greenley Col.2 lines 45-47) said first data value prior to loading said first 
data value into said target register; 

b. A memory coupled to said data processor (Greenley Col. 1 lines 41-43); and 

c. A plurality of memory-mapped peripheral circuits coupled to said data processor 
for performing selected functions in association with said data processor 
(Greenley Col. 1 line 29 to Col. 2 line 7 and Col. 2 lines 16-31). In 
Greenley, the ICACHE, prefetch unit, and memory subsystems perform selected 
functions, such as storing instructions, fetching instructions from selected 
locations, and accessing words at particular locations in memory. 

6. Greenley has not explicitly taught bypass circuitry associated with said load store unit 
capable of transferring said first data value from said data cache directly to said target register 
without processing said first data value in said shifter circuit. However, Greenley has taught a 
sign extension unit (Greenley 160 of Fig. 1) that performs a function to fill in unoccupied bits of a 
register by extending its sign after it is loaded from the data cache but before it is stored in the 
register file (Greenley Col. 2 lines 48-50). McGeer has taught bypassing functions (McGeer page 
184, Col. 2 lines 11-25; page 185, Col. 1 to Col. 2; Figure 1; and Figure 2). A person of ordinary 
skill in the art at the time the invention was made would have recognized that bypasses improve 
the performance of a system by minimizing delays from unnecessary functions (McGeer page 
185, Col. 1 to Col. 2). Therefore, it would have been obvious to a person of ordinary skill in the 
art at the time the invention was made to incorporate the bypassing of McGeer in the device of 
Greenley to improve system performance. 
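
For illustration only, the combination described in paragraph 6 can be sketched in Python as a load path in which a multiplexer selects either the raw data cache output (the bypass) or the output of the extension logic. This sketch is not drawn from Greenley or McGeer: the 32-bit register width and the names REG_WIDTH, sign_extend, zero_extend, and load_to_register are assumptions made for the example.

```python
REG_WIDTH = 32  # assumed register width in bits (hypothetical; not taken from either reference)


def sign_extend(value: int, bits: int, width: int = REG_WIDTH) -> int:
    """Fill the upper (width - bits) bits with copies of the sign bit."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        # Set every bit above the original field.
        value |= ((1 << width) - 1) & ~((1 << bits) - 1)
    return value


def zero_extend(value: int, bits: int) -> int:
    """Keep the low `bits` bits; the upper bits are left as zero."""
    return value & ((1 << bits) - 1)


def load_to_register(cache_data: int, bits: int, signed: bool) -> tuple[int, bool]:
    """Return (value written to the target register, whether the bypass was used)."""
    if bits == REG_WIDTH:
        # The data fills the register, so extension is unnecessary:
        # the multiplexer selects the data cache output directly.
        return cache_data & ((1 << REG_WIDTH) - 1), True
    # Narrower data goes through the extension logic first.
    if signed:
        return sign_extend(cache_data, bits), False
    return zero_extend(cache_data, bits), False
```

Under these assumptions, a full-width load reports that the bypass was used, while byte and half-word loads pass through the extension logic before reaching the register.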

7. Claim 1 is nearly identical to claim 14. Claim 1 differs in its lack of a main memory and 
memory-mapped peripheral circuits, but comprises the same data processor as claim 14, and is 
therefore rejected for the same reasons. 

8. Regarding claims 2 and 15, taking claim 15 as exemplary, Greenley in view of McGeer 
has taught the processing system as set forth in claim 14, wherein said bypass circuitry transfers 
said first data value from said data cache directly to said target register during a load word 
operation (see above rejection of claim 1 and McGeer page 184, Col. 2 lines 11-25; page 185, 
Col. 1 to Col. 2). While Greenley has taught a different register size than the applicant (Greenley 
Col.2 lines 17-19), the situation when a register has no unoccupied bits after being loaded with 
data from a data cache remains the same, with the size of the register and word being moot. 
Therefore Greenley's loading of a double word has the same consequences as the applicant's 
loading of a word. 

9. Claim 2 is nearly identical to claim 15. Claim 2 differs in its parent claim, but comprises 
the same data processor as claim 15, and is therefore rejected for the same reasons. 

10. Regarding claims 3 and 16, taking claim 16 as exemplary, Greenley in view of McGeer 
has taught the data processor as set forth in claim 15, wherein said bypass circuitry (McGeer 
page 184, Col. 2 lines 11-25; page 185, Col. 1 to Col. 2) transfers said first data value from said 
data cache directly to said target register at the end of two machine cycles (Greenley Col.4 lines 
17-20). 

11. Claim 3 is nearly identical to claim 16. Claim 3 differs in its parent claim, but comprises 
the same data processor as claim 16, and is therefore rejected for the same reasons. 

12. Regarding claims 4 and 17, taking claim 17 as exemplary, Greenley in view of McGeer 
has taught the data processor as set forth in claim 14, wherein said shifter circuit one of a) shifts, 
b) sign extends, or c) zero extends said first data value prior to loading said first data value into 
said target register during a load half-word operation (see Greenley Col. 2 lines 17-20, 24-30, 
46-47). 

13. Claim 4 is nearly identical to claim 17. Claim 4 differs in its parent claim, but comprises 
the same data processor as claim 17, and is therefore rejected for the same reasons. 

14. Regarding claims 5 and 18, taking claim 18 as exemplary, Greenley in view of McGeer 
has taught the data processor as set forth in claim 17, wherein said shifter circuit loads said 
shifted first data value into said target register at the end of two machine cycles (Greenley Col.4 
lines 17-20), but has not explicitly taught the load taking three machine cycles. However, 
Greenley has taught this two-cycle latency for all load instructions, including those without the 
need for sign extension. In the situation where the load instruction fills the target register 
completely and no sign extension is needed, such as when the data is already properly aligned 
since it is coming from the data cache, which only contains aligned data, or from the ALU, 
which outputs aligned data, the data processor as configured above will execute the load 
instruction at least one cycle faster due to the elimination of the alignment operations (McGeer 
page 184, Col. 2 lines 11-25; page 185, Col. 1 to Col. 2). This will create a latency of at least 
one cycle fewer for those load instructions which bypass the sign extension unit, and at least one 
more cycle for those which need sign extension, such as half-word load instructions. Because 
the applicant's claimed load instructions, which take 2 and 3 cycles for bypassed and 
sign-extended instructions, respectively, have no claimed advantages over the 1 and 2 cycles that 
Greenley in view of McGeer has taught, but are merely a change in the magnitude of latency, 
they are considered to be equivalent and thus taught by Greenley in view of McGeer (see In re 
Rose, 220 F.2d 459, 463, 105 USPQ 237, 240 (CCPA 1955)). 
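
The latency comparison in paragraph 14 can be restated as a small worked example. This is a sketch under assumptions: the function load_latency and its parameters are hypothetical, and the one-cycle extension penalty is inferred from the cycle counts stated above rather than from either reference.

```python
def load_latency(needs_extension: bool, bypass_cycles: int, extension_penalty: int = 1) -> int:
    """Cycles for a load: the bypassed latency plus an assumed penalty when extension is needed."""
    return bypass_cycles + (extension_penalty if needs_extension else 0)


# Greenley in view of McGeer, as characterized above: 1 cycle bypassed, 2 cycles extended.
combined = (load_latency(False, bypass_cycles=1), load_latency(True, bypass_cycles=1))

# The claimed load instructions: 2 cycles bypassed, 3 cycles extended.
claimed = (load_latency(False, bypass_cycles=2), load_latency(True, bypass_cycles=2))

# In both cases the extended load costs the same number of extra cycles;
# only the absolute magnitude of the latency differs.
same_gap = (claimed[1] - claimed[0]) == (combined[1] - combined[0])
```

In this model the two designs differ only in the bypass_cycles value, which corresponds to the change in magnitude of latency discussed above.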

15. Claim 5 is nearly identical to claim 18. Claim 5 differs in its parent claim, but comprises 
the same data processor as claim 18, and is therefore rejected for the same reasons. 

16. Regarding claims 6 and 19, taking claim 19 as exemplary, Greenley in view of McGeer 
has taught the data processor as set forth in claim 14, wherein said shifter circuit one of a) shifts, 
b) sign extends, and c) zero extends said first data value prior to loading said first data value into 
said target register during a load byte operation (see Greenley Col. 2 lines 17-20, 36-40, 46-47). 

17. Claim 6 is nearly identical to claim 19. Claim 6 differs in its parent claim, but comprises 
the same data processor as claim 19, and is therefore rejected for the same reasons. 

18. Regarding claims 7 and 20, taking claim 20 as exemplary, Greenley in view of McGeer 
has taught the data processor as set forth in claim 6, wherein said shifter circuit loads said shifted 
first data value into said target register at the end of two machine cycles (Greenley Col.4 lines 
17-20), but has not explicitly taught the transfer taking three machine cycles. However, 
Greenley has taught this two-cycle latency for all load instructions, including those without the 
need for sign extension. In the situation where the load instruction fills the target register 
completely and no sign extension is needed, such as when the data is already properly aligned 
since it is coming from the data cache, which only contains aligned data, or from the ALU, 
which outputs aligned data, the data processor as configured above will execute the load 
instruction at least one cycle faster due to the elimination of the alignment operations (McGeer 
page 184, Col. 2 lines 11-25; page 185, Col. 1 to Col. 2). This will create a latency of at least 
one cycle fewer for those load instructions which bypass the sign extension unit, and at least one 
more cycle for those which need sign extension, such as half-word load instructions. Because 
the applicant's claimed load instructions, which take 2 and 3 cycles for bypassed and 
sign-extended instructions, respectively, have no claimed advantages over the 1 and 2 cycles that 
Greenley in view of McGeer has taught, but are merely a change in the magnitude of latency, 
they are considered to be equivalent and thus taught by Greenley in view of McGeer (see In re 
Rose, 220 F.2d 459, 463, 105 USPQ 237, 240 (CCPA 1955)). 

19. Claim 7 is nearly identical to claim 20. Claim 7 differs in its parent claim, but comprises 
the same data processor as claim 20, and is therefore rejected for the same reasons. 

20. Regarding claims 8, 9, 21, and 22, taking claims 21 and 22 as exemplary, Greenley in 
view of McGeer has taught the data processor as set forth in claim 14, but Greenley has not 
explicitly taught 

a. Wherein said bypass circuitry comprises a multiplexer having a first input channel 
coupled to a data output of said data cache; and 

b. Wherein said multiplexer has a second input channel coupled to an output of said 
shifter circuit. 

21. However, Greenley has taught a sign extension unit (Greenley 160 of Fig. 1) that fills in 
unoccupied bits of a register by extending its sign after it is loaded from the data cache but 
before reaching a register in the register file (Greenley Col.2 lines 48-50). McGeer has taught 

a. Wherein said bypass circuitry comprises a multiplexer having a first input channel 
(McGeer page 184, Col. 2 lines 11-25; page 185, Col. 1 to Col. 2); and 

b. Wherein said multiplexer has a second input channel coupled to an output of 
another device (McGeer page 184, Col. 2 lines 11-25; page 185, Col. 1 to Col. 2). 

22. A person of ordinary skill in the art at the time the invention was made would have 
recognized that bypasses improve the performance of a system by minimizing delays from 
unnecessary functions (McGeer page 185, Col. 1 to Col. 2). Therefore, it would have been 
obvious to a person of ordinary skill in the art at the time the invention was made to incorporate 
the bypassing of McGeer in the device of Greenley to improve system performance. 

23. Claims 8 and 9 are nearly identical to claims 21 and 22 respectively. Claims 8 and 9 
differ in their parent claims, but comprise the same data processor as claims 21 and 22, and are 
therefore rejected for the same reasons. 

24. Regarding claim 10, Greenley has taught for use in a processor comprising an N-stage 
execution pipeline (Greenley Col. 1 lines 34-40), a data cache (Greenley 180 of Fig. 1), and a 
plurality of registers (Greenley 150 of Fig. 1), a method of loading a first data value from the data 
cache into a target one of the registers, the method comprising the steps of: 

a. Determining if a pending instruction in the execution pipeline is one of a load 
word operation, a load half-word operation, and a load byte operation (Greenley 
Col. 1 lines 63-67, Col. 2 lines 1-7, 17-19 and Col. 5 lines 13-24). 

b. In response to a determination that the pending instruction is a load half-word 
operation, transferring the first data value from the data cache to a shifter circuit 
and shifting the first data value prior to loading the first data value into the target 
register (Greenley Col.2 lines 24-31), 

c. In response to a determination that the pending instruction is a load byte 
operation, transferring the first data value from the data cache to a shifter circuit 
and shifting the first data value prior to loading the first data value into the target 
register (Greenley Col.2 lines 35-40). 

25. Greenley has not explicitly taught where in response to a determination that the pending 
instruction is a load word operation, transferring the first data value from the data cache directly 
to the target register without processing the first data value in the shifter circuit. However, 
Greenley has taught a sign extension unit (Greenley 160 of Fig. 1) that fills in unoccupied bits of 
a register by extending its sign after it is loaded from the data cache but before it is stored in a 
certain register in the register file (Greenley Col.2 lines 48-50). McGeer has taught bypassing 
functions (McGeer page 184, Col. 2 lines 11-25; page 185, Col. 1 to Col. 2). A person of 
ordinary skill in the art at the time the invention was made would have recognized that bypasses 
improve the performance of a system by minimizing delays from unnecessary functions (McGeer 
page 185, Col. 1 to Col. 2). Therefore, it would have been obvious to a person of ordinary skill 
in the art at the time the invention was made to incorporate the bypassing of McGeer in the 
device of Greenley to improve system performance. 
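
As an illustrative sketch only, the method steps discussed in paragraphs 24 and 25 can be modeled as a dispatch on the type of the pending load operation. Nothing here is taken from Greenley or McGeer: the 32-bit word, the little-endian byte offset, zero extension of the shifted field, and the names LoadOp, shifter, and execute_load are assumptions made for the example.

```python
from enum import Enum


class LoadOp(Enum):
    """Load operation types, with the width in bits of the loaded field."""
    WORD = 32
    HALF = 16
    BYTE = 8


def shifter(word: int, op: LoadOp, byte_offset: int) -> int:
    """Shift the addressed field down to the low-order bits and zero-extend it."""
    mask = (1 << op.value) - 1
    return (word >> (8 * byte_offset)) & mask


def execute_load(word: int, op: LoadOp, byte_offset: int = 0) -> tuple[int, str]:
    """Determine the load type, then route the data; returns (register value, path taken)."""
    if op is LoadOp.WORD:
        # Load word: transfer directly to the target register, skipping the shifter.
        return word & 0xFFFFFFFF, "bypass"
    # Load half-word or load byte: shift before loading the target register.
    return shifter(word, op, byte_offset), "shifter"
```

With these assumptions, execute_load(0xAABBCCDD, LoadOp.BYTE, 1) selects the second-lowest byte through the shifter path, while a LoadOp.WORD load returns the cache word unchanged through the bypass path.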

26. Regarding claim 11, Greenley in view of McGeer has taught the method as set forth in 
claim 10, wherein the step of transferring the first data value requires two machine cycles during 
a load word operation (Greenley Col. 4 lines 17-20). While Greenley has taught a different 
register size than the applicant (Greenley Col.2 lines 17-19), the situation when a register has no 
unoccupied bits after being loaded with data from a data cache remains the same, with the size of 
the register and word being moot. Therefore Greenley's loading of a double word has the same 
consequences as the applicant's loading of a word (In re Rose, 220 F.2d 459, 463, 105 USPQ 
237, 240 (CCPA 1955)). 

27. Regarding claim 12, Greenley in view of McGeer has taught the method as set forth in 
claim 10, wherein the step of transferring the first data value requires two machine cycles during 
a load half-word operation (Greenley Col.4 lines 17-20), but has not explicitly taught the transfer 
taking three machine cycles. However, Greenley has taught this two-cycle latency for all load 
instructions, including those without the need for sign extension. In the situation where the load 
instruction fills the target register completely and no sign extension is needed, such as when the 
data is already properly aligned since it is coming from the data cache, which only contains 
aligned data, or from the ALU, which outputs aligned data, the data processor as configured 
above will execute the load instruction at least one cycle faster due to the elimination of the 
alignment operations (McGeer page 184, Col. 2 lines 11-25; page 185, Col. 1 to Col. 2). This 
will create a latency of at least one cycle fewer for those load instructions which bypass the sign 
extension unit, and at least one more cycle for those which need sign extension, such as 
half-word load instructions. Because the applicant's claimed load instructions, which take 2 and 3 
cycles for bypassed and sign-extended instructions, respectively, have no claimed advantages 
over the 1 and 2 cycles that Greenley in view of McGeer has taught, but are merely a change in 
the magnitude of latency, they are considered to be equivalent and thus taught by Greenley in 
view of McGeer (see In re Rose, 220 F.2d 459, 463, 105 USPQ 237, 240 (CCPA 1955)). 

28. Regarding claim 13, Greenley in view of McGeer has taught the method as set forth in 
claim 10 wherein the step of transferring the first data value requires two machine cycles during 
a load byte operation (Greenley Col.4 lines 17-20), but has not explicitly taught the transfer 
taking three machine cycles. However, Greenley has taught this two-cycle latency for all load 
instructions, including those without the need for sign extension. In the situation where the load 
instruction fills the target register completely and no sign extension is needed, such as when the 
data is already properly aligned since it is coming from the data cache, which only contains 
aligned data, or from the ALU, which outputs aligned data, the data processor as configured 
above will execute the load instruction at least one cycle faster due to the elimination of the 
alignment operations (McGeer page 184, Col. 2 lines 11-25; page 185, Col. 1 to Col. 2). This 
will create a latency of at least one cycle fewer for those load instructions which bypass the sign 
extension unit, and at least one more cycle for those which need sign extension, such as 
half-word load instructions. Because the applicant's claimed load instructions, which take 2 and 3 
cycles for bypassed and sign-extended instructions, respectively, have no claimed advantages 
over the 1 and 2 cycles that Greenley in view of McGeer has taught, but are merely a change in 
the magnitude of latency, they are considered to be equivalent and thus taught by Greenley in 
view of McGeer (see In re Rose, 220 F.2d 459, 463, 105 USPQ 237, 240 (CCPA 1955)). 

Response to Arguments 

29. Applicant's arguments, see After Final, filed 30 August 2005, with respect to the 
rejection(s) of claim(s) 1-22 under Greenley in view of Zaidi have been fully considered and are 
persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, 
a new ground(s) of rejection is made in view of the above rejection. 

Conclusion 

30. The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure as follows. Applicant is reminded that in amending in response to a rejection of 
claims, the patentable novelty must be clearly shown in view of the state of the art disclosed by 
the references cited and the objections made. Applicant must also show how the amendments 
avoid such references and objections. See 37 CFR §1.111(c). 

a. Berman, Hathaway, LaPaugh, and Trevillyan's "Efficient Techniques for Timing 
Correction" from IEEE ©1990 has taught a bypassing mechanism. 

31. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Aimee J. Li whose telephone number is (571) 272-4169. The 
examiner can normally be reached on M-T 7:30am-5:00pm. 

32. If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Eddie Chan can be reached on (571) 272-4162. The fax phone number for the 
organization where this application or proceeding is assigned is 703-872-9306. 

33. Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 

AJL 

Aimee J. Li 

20 September 2005 




